
Introduction to Time Series Analysis - 02

This note is for course MATH 545 at McGill University.

Lecture 4 - Lecture 6

We recommend an R package named “forecast”.


First-Order Autoregressive Process AR(1)

Assume $\{X_t\}$ is a stationary sequence of random variables satisfying $X_t = \phi X_{t-1} + Z_t$ for $t = 0, \pm 1, \pm 2, \ldots$, where $\{Z_t\}$ is a white noise process $WN(0, \sigma^2)$, $Z_t$ is uncorrelated with $X_s$ for all $s < t$, and $\phi$ is a real-valued constant.

(Graphical representation will be added later)

We have $E(X_t) = \phi E(X_{t-1}) + E(Z_t) = \phi \mu_X + 0$, where $\mu_X = E(X_t)$ is constant by stationarity.

Since $\{X_t\}$ is assumed stationary, $\mu_X = \phi \mu_X$, i.e. $\mu_X(1 - \phi) = 0$, so we need $\mu_X = 0$ (unless $\phi = 1$, a case ruled out below since we will require $|\phi| < 1$).

By construction, $X_{t-h}X_t = X_{t-h}(\phi X_{t-1} + Z_t)$ for $h \geq 1$.

Taking expectations, $E(X_{t-h}X_t) = E\bigl(X_{t-h}(\phi X_{t-1} + Z_t)\bigr)$, so

$E(X_{t-h}X_t) = \phi E(X_{t-h}X_{t-1}) + E(X_{t-h}Z_t) = \phi E(X_{t-h}X_{t-1})$, since $Z_t$ is uncorrelated with $X_{t-h}$ (and both have mean zero).

Then we have $\gamma_X(h) = \phi \gamma_X(h-1) = \phi[\phi \gamma_X(h-2)] = \ldots = \phi^h \gamma_X(0)$, and $\rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)} = \phi^h$.

By symmetry and stationarity, we have $\rho_X(h) = \phi^{|h|}$ for all integer $h$.

$\gamma_X(0) = \mathrm{Cov}(X_t, X_t) = E\bigl[(\phi X_{t-1} + Z_t)(\phi X_{t-1} + Z_t)\bigr] = \phi^2 E(X_{t-1}^2) + 2\phi E(X_{t-1}Z_t) + E(Z_t^2) = \phi^2 \gamma_X(0) + \sigma^2$, where the cross term $E(X_{t-1}Z_t)$ vanishes because $X_{t-1}$ and $Z_t$ are uncorrelated and both have mean zero.

Therefore, we have $\gamma_X(0) = \frac{\sigma^2}{1-\phi^2}$, which requires $|\phi| < 1$.
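As a quick sanity check, here is a minimal R sketch (not part of the lecture; the choices $\phi = 0.7$, $n = 500$, and standard normal innovations are arbitrary) that simulates an AR(1) series and compares the sample ACF with $\phi^{|h|}$, and the sample variance with $\sigma^2/(1-\phi^2)$:

```r
# Simulate an AR(1) with phi = 0.7 and standard normal innovations (sigma^2 = 1)
set.seed(545)
phi <- 0.7
x   <- arima.sim(model = list(ar = phi), n = 500)

# Sample ACF vs. the theoretical ACF phi^|h|
sample_acf      <- acf(x, lag.max = 10, plot = FALSE)$acf[, , 1]
theoretical_acf <- phi^(0:10)
round(cbind(lag = 0:10, sample = sample_acf, theory = theoretical_acf), 3)

# Sample variance vs. gamma_X(0) = sigma^2 / (1 - phi^2)
c(sample = var(x), theory = 1 / (1 - phi^2))
```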

Estimating Autocorrelation

Let $X_1, \ldots, X_n$ be observed values from a stationary sequence, with sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

For two random variables, the covariance is $\mathrm{Cov}(V, W) = E[(V - E(V))(W - E(W))]$, and an unbiased estimator of the covariance from a sample is $\widehat{\mathrm{Cov}}(V, W) = \frac{\sum_{i=1}^{n}(V_i - \bar{V})(W_i - \bar{W})}{n - 1}$.

$\gamma_X(h) = \mathrm{Cov}(X_{t+h}, X_t)$

$\hat{\gamma}_X(h) = \frac{1}{n}\sum_{t=1}^{n-|h|}\bigl(X_{t+|h|} - \bar{X}\bigr)\bigl(X_t - \bar{X}\bigr)$ for $-n < h < n$

The sample autocorrelation is $\hat{\rho}(h) = \frac{\hat{\gamma}_X(h)}{\hat{\gamma}_X(0)}$, where $\hat{\gamma}_X(0) = \frac{1}{n}\sum_{t=1}^{n}(X_t - \bar{X})^2$.
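These estimators can be coded directly from the definitions above. A small sketch (my own, not from the notes), checked against R's built-in `acf()`, which uses the same $1/n$ convention:

```r
# Sample autocovariance gamma_hat(h) with the 1/n divisor, as defined above
sample_gamma <- function(x, h) {
  n    <- length(x)
  h    <- abs(h)
  xbar <- mean(x)
  sum((x[(1 + h):n] - xbar) * (x[1:(n - h)] - xbar)) / n
}

# Sample autocorrelation rho_hat(h) = gamma_hat(h) / gamma_hat(0)
sample_rho <- function(x, h) sample_gamma(x, h) / sample_gamma(x, 0)

x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))
sapply(0:5, function(h) sample_rho(x, h))
acf(x, lag.max = 5, plot = FALSE)$acf[, , 1]   # should agree with the line above
```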

Classical Decomposition Model

$X_t = m_t + S_t + Y_t$

Here $m_t$ is the trend component, $S_t$ is the seasonal component, and $Y_t$ is the random noise component (so far we have four choices of noise: iid noise, white noise, MA(1), and AR(1)).

We can remove $m_t$ and $S_t$ to estimate $Y_t$, and we have two ways:

  1. estimate the trend and seasonal components using a "model" (filter)
  2. difference $\{X_t\}$ to remove the trend and seasonality (filter)

Estimate Trend

$X_t = m_t + Y_t, \quad t = 1, \ldots, n$, with $E(Y_t) = 0$

We can use nonparametric methods, which are flexible and make fewer assumptions, but are subjective:

  1. Finite Moving Average Filter (to capture local trend)

$W_t = \frac{1}{2q+1}\sum_{j=-q}^{q} X_{t-j} = \frac{1}{2q+1}\sum_{j=-q}^{q}(m_{t-j} + Y_{t-j}) = \frac{1}{2q+1}\sum_{j=-q}^{q} m_{t-j} + \frac{1}{2q+1}\sum_{j=-q}^{q} Y_{t-j} \approx \frac{1}{2q+1}\sum_{j=-q}^{q} m_{t-j}$, where $q$ is a positive integer (the noise terms roughly average out to zero over the window).

The moving average is a linear filter with weights $a_j = \begin{cases}\frac{1}{2q+1}, & \text{for } |j| \leq q \\ 0, & \text{otherwise}\end{cases}$

Our goal is $X_t - \hat{m}_t = \hat{Y}_t$; an R sketch of this filter (and of the exponential smoothing below) appears after this list.

  2. Exponential smoothing

$\hat{m}_t = \alpha X_t + (1-\alpha)\hat{m}_{t-1}$, where $\alpha \in [0, 1]$ is a smoothing parameter.

For $t = 1$, we have $\hat{m}_1 = X_1$. For $t \geq 2$, unrolling the recursion gives $\hat{m}_t = \sum_{j=0}^{t-2}\alpha(1-\alpha)^j X_{t-j} + (1-\alpha)^{t-1}X_1$.

  3. Parametric smoothing (linear, polynomial, basis functions such as B-splines)
  4. High-frequency smoothing using Fourier series
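
Below is an R sketch of the first two filters on simulated data (the linear trend, the window parameter $q = 5$, and the smoothing parameter $\alpha = 0.2$ are illustrative choices, not values from the lecture):

```r
set.seed(1)
n <- 120
x <- 0.5 * (1:n) + rnorm(n, sd = 3)      # X_t = m_t + Y_t with a linear trend

# 1. Finite moving average filter: weights a_j = 1/(2q+1) for |j| <= q
q    <- 5
m_ma <- stats::filter(x, rep(1 / (2 * q + 1), 2 * q + 1), sides = 2)

# 2. Exponential smoothing: m_hat_t = alpha*X_t + (1-alpha)*m_hat_{t-1}, m_hat_1 = X_1
alpha <- 0.2
m_exp <- numeric(n)
m_exp[1] <- x[1]
for (t in 2:n) m_exp[t] <- alpha * x[t] + (1 - alpha) * m_exp[t - 1]

# Detrended series Y_hat_t = X_t - m_hat_t (the moving average leaves NAs at the ends)
y_hat <- x - m_ma
```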
Differencing (for trend)

We define the lag-1 difference as $\nabla X_t = X_t - X_{t-1} = (1 - B)X_t$, where $B$ is the backward shift operator, defined by $BX_t = X_{t-1}$.

We can generalize $\nabla$ and $B$ to general lags by taking powers:

$B^j X_t = B^{j-1}(BX_t) = B^{j-1}X_{t-1} = \ldots = X_{t-j}$

$X_t - X_{t-j} = (1 - B^j)X_t$

Since $\nabla^j X_t = \nabla(\nabla^{j-1}X_t)$, we have:

$\nabla^2 X_t = \nabla(\nabla X_t) = \nabla\bigl((1-B)X_t\bigr) = (1-B)(1-B)X_t = (1 - 2B + B^2)X_t = X_t - 2X_{t-1} + X_{t-2} = (X_t - X_{t-1}) - (X_{t-1} - X_{t-2})$

Let $X_t = m_t + Y_t$, where $m_t = a + bt$; then

$\nabla X_t = \nabla(m_t + Y_t) = \nabla m_t + \nabla Y_t = (m_t - m_{t-1}) + (Y_t - Y_{t-1}) = (a + bt) - (a + b(t-1)) + Y_t - Y_{t-1} = b + Y_t - Y_{t-1}$

Therefore $\nabla X_t$ will be stationary (with mean $b$) whenever $Y_t - Y_{t-1}$ is stationary.
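A quick numerical illustration (simulated data with assumed values $a = 10$, $b = 0.3$): after lag-1 differencing, the linear trend is gone and the series fluctuates around $b$.

```r
set.seed(2)
n <- 200
a <- 10; b <- 0.3
y <- rnorm(n)                      # a stationary noise sequence Y_t
x <- a + b * (1:n) + y             # X_t = a + b*t + Y_t

dx <- diff(x)                      # nabla X_t = X_t - X_{t-1} = b + Y_t - Y_{t-1}
mean(dx)                           # close to b = 0.3

d2x <- diff(x, differences = 2)    # nabla^2 X_t = X_t - 2 X_{t-1} + X_{t-2}
mean(d2x)                          # close to 0
```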

Estimate seasonal component

An example with $d = 4$:

| $k=1$ | $k=2$ | $k=3$ | $k=4$ | |
|---|---|---|---|---|
| $\tilde{x}_{1}$ | $\tilde{x}_{2}$ | $\tilde{x}_{3}$ | $\tilde{x}_{4}$ | $j=0$ |
| $\tilde{x}_{5}$ | $\tilde{x}_{6}$ | $\tilde{x}_{7}$ | $\tilde{x}_{8}$ | $j=1$ |
| $\tilde{x}_{9}$ | $\tilde{x}_{10}$ | $\tilde{x}_{11}$ | $\tilde{x}_{12}$ | $j=2$ |
| $\downarrow$ | $\downarrow$ | $\downarrow$ | $\downarrow$ | |
| $S_1$ | $S_2$ | $S_3$ | $S_4$ | |

$W_k = \frac{1}{n/d}\sum_{j=0}^{n/d - 1}\bigl(X_{k+jd} - \hat{m}_{k+jd}\bigr)$, the average of the detrended values falling in season $k$ (indexing as in the table, assuming $n$ is a multiple of $d$; in practice the average is taken over the terms for which $\hat{m}_{k+jd}$ is available).

$\hat{S}_k = W_k - \frac{1}{d}\sum_{i=1}^{d} W_i$

Let $d_t = X_t - \hat{S}_t$ be the deseasonalized data. We can then re-estimate the trend $\tilde{m}_t$ from $d_t$, and finally set $\hat{Y}_t = X_t - \hat{S}_t - \tilde{m}_t$.
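Here is an R sketch of this seasonal-estimation step for $d = 4$ (the simulated quarterly-style data and the centered moving-average trend filter used for an even period are my own illustrative choices, not from the notes):

```r
set.seed(3)
d <- 4; n <- 48
x <- 2 + 0.1 * (1:n) + rep(c(3, -1, -2, 0), times = n / d) + rnorm(n, sd = 0.5)

# Trend estimate m_hat via a centered moving average
# (half weights at the two ends because the period d is even)
w     <- c(0.5, rep(1, d - 1), 0.5) / d
m_hat <- stats::filter(x, w, sides = 2)

# W_k: average of the detrended values falling in season k
detrended <- x - m_hat
W <- tapply(detrended, rep(1:d, times = n / d), mean, na.rm = TRUE)

# S_hat_k = W_k - (1/d) * sum(W_i), so the seasonal effects sum to zero
S_hat <- W - mean(W)
round(S_hat, 2)

# Deseasonalized data d_t = X_t - S_hat_t, from which the trend can be re-estimated
deseason <- x - rep(S_hat, times = n / d)

# Base R's decompose() performs the same classical decomposition on a ts object:
# decompose(ts(x, frequency = d))
```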

Differencing (for seasonal)

$\nabla_d X_t = X_t - X_{t-d} = (1 - B^d)X_t$

Applying this to $X_t = m_t + S_t + Y_t$, we have:

$\nabla_d X_t = \nabla_d(m_t + S_t + Y_t) = (m_t - m_{t-d}) + (S_t - S_{t-d}) + (Y_t - Y_{t-d})$

Since $S_t$ is periodic with period $d$, $S_t - S_{t-d} = 0$; therefore $\tilde{X}_t := \nabla_d X_t = (m_t - m_{t-d}) + (Y_t - Y_{t-d})$.

If $m_t = a + bt$, then $\tilde{X}_t = \bigl((a + bt) - (a + b(t-d))\bigr) + (Y_t - Y_{t-d}) = bd + (Y_t - Y_{t-d})$.
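A short simulated check (assumed values $b = 0.2$, $d = 12$): lag-$d$ differencing removes the period-$d$ seasonal pattern and leaves a series centered at $bd$.

```r
set.seed(4)
d <- 12; n <- 240
a <- 5; b <- 0.2
s <- rep(sin(2 * pi * (1:d) / d), times = n / d)   # seasonal component with period d
x <- a + b * (1:n) + s + rnorm(n, sd = 0.5)        # X_t = m_t + S_t + Y_t

x_tilde <- diff(x, lag = d)   # nabla_d X_t = X_t - X_{t-d}
mean(x_tilde)                 # close to b * d = 2.4
```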

If $Y_t \stackrel{\text{iid}}{\sim} (0, \sigma^2)$, then for each fixed $h \geq 1$ we have, approximately for large $n$, $\hat{\rho}(h) \stackrel{\cdot}{\sim} N\left(0, \frac{1}{n}\right)$. (Stated without proof.)

(Recall that $\hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)} = \frac{\sum_{t=1}^{n-|h|}(X_t - \bar{X})(X_{t+|h|} - \bar{X})/n}{\sum_{t=1}^{n}(X_t - \bar{X})^2/n}$.)
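
This approximation is what produces the dashed bands in R's default `acf()` plot. A quick sketch with simulated iid noise, checking that roughly 95% of the sample autocorrelations at lags $h \geq 1$ fall within $\pm 1.96/\sqrt{n}$:

```r
set.seed(5)
n <- 400
y <- rnorm(n)                 # iid noise

rho_hat <- acf(y, lag.max = 20, plot = FALSE)$acf[-1, , 1]  # lags 1..20
bound   <- 1.96 / sqrt(n)
mean(abs(rho_hat) < bound)    # proportion inside the bands, typically around 0.95
```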